11 December, 2006

The statistics of predicting violence in mental patients

Improved version

A few weeks ago Ben Goldacre wrote an interesting article in his Bad Science column in the Guardian about the statistics of predicting violence in mentally disturbed subjects. Like most of his articles, he also posted it on his blog.

His point was that the statistics of testing for medical conditions, e.g. HIV tests, also applies to testing for potentially violent subjects with mental health problems; and this has consequences for how successful we are at predicting whether subjects will be violent or not. Like most of his stuff, it is a good article; but it does rely on the assumption that the statistical model for HIV testing works for predicting violence in mental patients.

I don’t know much about medical statistics, especially mental health statistics, but I do know the odd thing about statistics in general, and it is not obvious to me that this assumption holds.

I will try to explain. Ben covers the first bit better than I ever could - but I am repeating it in my own words to save readers having to shoot across to his blog and back.

When medical statisticians discuss tests such as HIV tests, or mammography tests for breast cancer, they try to determine two fundamental parameters of the test:

Sensitivity: how likely is the test to be positive if you have the condition? e.g. if you have HIV, what is the probability that the HIV test will indicate that you have it?

Specificity: how likely is the test to be negative if you don't have the condition? e.g. if you don't have HIV, what is the probability that the HIV test will indicate you don't have it?

Now suppose you have no idea whether you have HIV or not and you take an HIV test. The thing that is most likely to interest you is:

if the test is positive, what are the chances I actually have HIV?

This is called the "positive predictive value" (PPV) - the value of a positive result in predicting the underlying condition.

It turns out that you can't calculate the PPV from the specificity and sensitivity alone; you have to estimate one other value: the prevalence of the condition in the community, i.e. what proportion of people, on average, have HIV. You need to know what proportion of the positive test results come from the group who don't have the condition but tested positive, and what proportion come from the group who do have the condition and tested positive.
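To make that concrete, here is a short Python sketch of the calculation. The numbers are purely illustrative, not taken from any real test:

```python
def ppv(sensitivity, specificity, prevalence):
    """Positive predictive value via Bayes' theorem."""
    # Proportion of the population who have the condition AND test positive.
    true_positives = sensitivity * prevalence
    # Proportion who don't have the condition but test positive anyway.
    false_positives = (1 - specificity) * (1 - prevalence)
    return true_positives / (true_positives + false_positives)

# A hypothetical test that is 99% sensitive and 99% specific,
# in a community where 1 in 1000 people have the condition:
print(ppv(0.99, 0.99, 0.001))  # about 0.09 - only ~9% of positives have it
```

Even with an impressively accurate test, a low prevalence means most positive results are false positives.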

So far, all well and good. If you don't follow what I have written then look at Ben's post.

But here is the tricky bit. Ben applies the same reasoning to assessing mental patients for violence. In this case the condition (the equivalent of having HIV or not) is being violent or not; and the test (the equivalent of the HIV test) is an assessment, which might be one of a number of different types ranging from expert clinical judgement of high, medium or low risk, through to more mechanical assessments based on the presence or absence of risk factors, such as the Violence Risk Appraisal Guide (VRAG). The idea is that you can also apply the concepts of sensitivity and specificity to these conditions and tests.

So in this context:

Sensitivity: given that a subject is violent, how likely is the assessment to indicate violence?

Specificity: given that a subject is not violent, how likely is the assessment to indicate non-violence?

And of course what we are really interested in is:

if the assessment indicates the subject will be violent, what are the chances the subject actually will be violent?

The two examples seem analogous. But there is a key assumption in the HIV example which is far from obvious in the violence example. To see what this assumption is go back to the HIV example. The question that really interests us is:

if the test is positive, what are the chances I actually have HIV?

We saw that calculating this value from the specificity and sensitivity was a bit tedious, requiring knowledge of the prevalence of HIV in the community. So why not measure the answer directly? Why don't we just test a large sample from the community in question and see how many of those with positive results actually have HIV?

I can think of two reasons:

* There is a practical problem. In most communities very few people have HIV, so very few people in a random sample will test positive. It would be necessary to test a very large sample to get enough positive results to be useful, and then to follow those cases for some time to see which of them actually had HIV.

* There is also a deeper and subtler problem. The PPV for an HIV test is dependent on the prevalence of HIV in the population. So even if we were to measure the PPV of a test for one community, this figure would be of little use in another community with a very different prevalence, e.g. some parts of Africa.
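The second point is easy to demonstrate with a few lines of Python: hold the sensitivity and specificity of a hypothetical test fixed and vary the prevalence (again, the numbers are invented for illustration):

```python
def ppv(sensitivity, specificity, prevalence):
    """Positive predictive value via Bayes' theorem."""
    true_positives = sensitivity * prevalence
    false_positives = (1 - specificity) * (1 - prevalence)
    return true_positives / (true_positives + false_positives)

# The same hypothetical test (99% sensitive, 99% specific) in two
# communities: one with 0.1% prevalence, one with 20% prevalence.
for prevalence in (0.001, 0.2):
    print(prevalence, round(ppv(0.99, 0.99, prevalence), 3))
```

The PPV jumps from under 10% to over 95% with no change at all in the test itself - it is a property of the test *and* the community.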

The preferred approach is to measure the sensitivity and specificity, because these are reasonably easy to measure (get a sample of known HIV patients and give them the test, and a sample of known non-HIV patients and give them the test as well), and because they are values that are unlikely to change between communities. So once we have established those values, we only need to estimate the prevalence of HIV in the community and we can estimate the PPV.

Notice the key assumption - the sensitivity and specificity are unlikely to change with different communities and in particular with the prevalence of HIV in the community. That's what makes the model useful. What's the equivalent assumption in the case of predicting violence? It is:

"The probability of a person who is violent being assessed as violent is reasonably constant across different communities and is independent of the prevalence of violence in the population"

and

"The probability of a person who is not violent being assessed as not violent is reasonably constant across different communities and is independent of the prevalence of violence in the population"

Why should we believe these assumptions to be true? In the case of HIV testing the assumptions made intuitive sense because the presence of HIV is causally related to the result of the test. But the act of violence does not cause an assessment result - in fact it happens afterwards. Suppose the assessment is based on a factor such as "regular substance abuse". We discover that 25 out of every 100 people with substance abuse become violent. We also discover that there are other factors, e.g. a low level of educational attainment, that are good predictors of violence - but not as strong. Then in a community where substance abuse is common, if someone is violent there is a high chance they will also be a regular substance abuser. In a community where substance abuse is rare but educational attainment is low, then if someone is violent the probability of them being a regular substance abuser will be lower. In other words, the sensitivity varies from one community to another.
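A rough Python sketch shows the effect. All the numbers here are made up: I assume the assessment simply flags regular substance abusers, and that the probability of violence given abuse (0.25) or given no abuse (0.05) is the same in both communities - which is exactly the "constant PPV" behaviour described above:

```python
P_VIOLENT_GIVEN_ABUSE = 0.25     # hypothetical, held constant everywhere
P_VIOLENT_GIVEN_NO_ABUSE = 0.05  # hypothetical, held constant everywhere

def sensitivity(abuse_rate):
    """P(flagged | violent) = P(abuser | violent), by Bayes' theorem."""
    violent_abusers = P_VIOLENT_GIVEN_ABUSE * abuse_rate
    violent_non_abusers = P_VIOLENT_GIVEN_NO_ABUSE * (1 - abuse_rate)
    return violent_abusers / (violent_abusers + violent_non_abusers)

print(round(sensitivity(0.40), 2))  # community where abuse is common
print(round(sensitivity(0.05), 2))  # community where abuse is rare
```

With abuse common (40%), the sensitivity is about 0.77; with abuse rare (5%), it drops to about 0.21 - even though the probability of violence given the profile never changed.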

Now remember what we are trying to achieve. Ideally we want the PPV: the probability of being violent given a positive assessment. In the case of HIV we couldn't measure this directly because it varied with the prevalence of HIV. But in the case of mental patients and violence it seems quite plausible that the probability of being violent given a positive assessment is reasonably constant across populations, or at least independent of the prevalence of violence. This is equivalent to saying that the probability of being violent given a specific profile, e.g. substance abuse, is fairly constant across communities.

Which is the right model for mental health: the HIV model, where sensitivity and specificity are stable across communities, or this second model, where the PPV is stable? I don't know - but I don't think it is obvious that it is the HIV model, and that's why I am asking the question.

It shouldn't be hard to find the answer. Testing whether the assumptions of your model hold is basic statistics, and it seems quite possible to do in this case (someone may already have done so). The key assumption of the second model is that Pr(violent act|assessment result) is approximately constant and not affected by how prevalent any particular assessment result is in the population. This assumption can be tested: it would hold if studies showed that the level of violence given an assessment result was pretty much the same across a large range of studies. However, I haven't found any such study to date. I am hoping that someone with detailed knowledge of the field can supply the answer.
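As a sketch of what such a test might look like, here is some Python using entirely made-up study counts (pairs of: subjects assessed as high risk, number of those who went on to be violent). It pools the proportions and computes a chi-square statistic for homogeneity; a small statistic is consistent with Pr(violent|positive assessment) being constant across studies:

```python
# Hypothetical counts from three imagined studies:
# (assessed as high risk, of whom later violent)
studies = [(200, 50), (120, 33), (300, 72)]

# Pooled estimate of Pr(violent | positive assessment).
total_positive = sum(n for n, _ in studies)
total_violent = sum(v for _, v in studies)
pooled = total_violent / total_positive

# Chi-square statistic for homogeneity of the three proportions.
chi2 = 0.0
for n, v in studies:
    expected_violent = pooled * n
    expected_not = (1 - pooled) * n
    chi2 += (v - expected_violent) ** 2 / expected_violent
    chi2 += ((n - v) - expected_not) ** 2 / expected_not

print(round(pooled, 3), round(chi2, 2))
```

With these invented counts the pooled proportion is 0.25 and the statistic is about 0.56 on 2 degrees of freedom, well below the 5% critical value of 5.99 - so these (fictional) studies would be consistent with a constant PPV. Real study data would of course be needed to settle the question.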